-
Notifications
You must be signed in to change notification settings - Fork 12.2k
Add RLP library #5680
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add RLP library #5680
Conversation
Co-authored-by: Hadrien Croubois <[email protected]>
Co-authored-by: Arr00 <[email protected]>
🦋 Changeset detectedLatest commit: 6efa28d The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (9)
.changeset/lovely-cooks-add.md (1)
5-5
: Call out new public utils surfaced by this PR.This PR introduces public APIs in Bytes (Accumulator) and Memory (Slice). Consider mentioning them so downstream users see the new utilities in release notes.
Apply this diff to amend the body:
-`RLP`: Add library for Ethereum's Recursive Length Prefix encoding/decoding. +`RLP`: Add library for Ethereum's Recursive Length Prefix encoding/decoding. + +Also introduces `Bytes.Accumulator` (a memory‑only bytes builder) and `Memory.Slice` +utilities used by `RLP` and available for general use.test/utils/Bytes.t.sol (2)
10-10
: Narrow the using directive to avoid global pollution.Limit extension scope to the types you actually use.
Apply:
- using Bytes for *; + using Bytes for bytes; + using Bytes for bytes[];
227-237
: Great golden‑path test for Accumulator.Covers push/shift order and flatten result. Consider an explicit empty‑accumulator assertion as well.
Add alongside existing tests:
function testAccumulatorEmpty() public pure { Bytes.Accumulator memory acc = Bytes.accumulator(); assertEq(acc.flatten(), hex""); }contracts/utils/Bytes.sol (2)
271-299
: Return value on push/shift is unnecessary.Since memory params are passed by reference for internal calls, returning
Accumulator memory
is redundant and invites accidental copies.Apply:
- function push(Accumulator memory self, bytes memory data) internal pure returns (Accumulator memory) { + function push(Accumulator memory self, bytes memory data) internal pure { // ... - return self; } @@ - function shift(Accumulator memory self, bytes memory data) internal pure returns (Accumulator memory) { + function shift(Accumulator memory self, bytes memory data) internal pure { // ... - return self; }Note: update call sites only if they captured the return (current tests don’t).
141-161
: concat() is clear and safe; optional micro‑opt.You could preallocate and copy in one Yul loop, but current version favors readability and is fine.
contracts/utils/Memory.sol (1)
93-94
: Clarify the out-of-bounds behavior in documentation.The comment mentions that part of the return value will be out of bounds but should be ignored. This could lead to misuse where callers accidentally process invalid data.
Consider either:
- Adding a bounds check to prevent reading beyond the slice
- Updating the documentation to explicitly warn about undefined behavior
/** * @dev Read a bytes32 buffer from a given Slice at a specific offset * - * Note:If offset > length(slice) - 32, part of the return value will be out of bound and should be ignored. + * WARNING: If offset > length(slice) - 32, the return value will contain undefined data from memory + * beyond the slice bounds. Callers MUST ensure offset + 32 <= length(slice) or handle partial reads appropriately. */contracts/utils/RLP.sol (3)
216-217
: Consider removing or documenting the use case forreadRawBytes
.The TODO comment questions the usefulness of this function. If there's no clear use case, consider removing it to reduce the API surface area.
Either document a specific use case or remove this function if it's not needed.
348-349
: Replace TODO comments with proper custom errors.Multiple locations use generic
require
statements with TODO comments indicating custom errors should be added for sanity checks.Add specific custom errors for these validation cases:
+/// @dev Invalid RLP encoding: single byte string must be >= 0x80 +error RLPInvalidSingleByteString(); + +/// @dev Invalid RLP encoding: length has leading zeros +error RLPInvalidLeadingZeros(); // Line 348-349 -require(bytes1(item.load(1)) >= bytes1(SHORT_OFFSET)); // TODO: custom error for sanity checks +if (bytes1(item.load(1)) < bytes1(SHORT_OFFSET)) revert RLPInvalidSingleByteString(); // Line 356 -require(bytes1(item.load(0)) != 0x00); // TODO: custom error for sanity checks +if (bytes1(item.load(0)) == 0x00) revert RLPInvalidLeadingZeros(); // Line 376 -require(bytes1(item.load(0)) != 0x00); +if (bytes1(item.load(0)) == 0x00) revert RLPInvalidLeadingZeros();Also applies to: 356-357, 376-376
271-292
: Add validation for currentOffset to prevent potential issues.After the loop at line 285, there's no validation that
currentOffset
equalsitemLength
. This could hide malformed RLP data.Add a validation check after the loop:
uint256 itemCount; for (uint256 currentOffset = listOffset; currentOffset < itemLength; ++itemCount) { (uint256 elementOffset, uint256 elementLength, ) = _decodeLength(item.slice(currentOffset)); list[itemCount] = item.slice(currentOffset, elementLength + elementOffset); currentOffset += elementOffset + elementLength; } + +// Ensure we consumed exactly the expected bytes +require(currentOffset == itemLength, RLPContentLengthMismatch(itemLength, currentOffset)); // Decrease the array size to match the actual item count.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
.changeset/lovely-cooks-add.md
(1 hunks)contracts/mocks/Stateless.sol
(1 hunks)contracts/token/ERC20/extensions/ERC4626.sol
(1 hunks)contracts/utils/Bytes.sol
(2 hunks)contracts/utils/Memory.sol
(2 hunks)contracts/utils/README.adoc
(2 hunks)contracts/utils/RLP.sol
(1 hunks)test/utils/Bytes.t.sol
(2 hunks)test/utils/Memory.t.sol
(2 hunks)test/utils/RLP.t.sol
(1 hunks)test/utils/RLP.test.js
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
test/utils/RLP.test.js (1)
test/helpers/random.js (1)
generators
(3-10)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: halmos
- GitHub Check: coverage
- GitHub Check: slither
- GitHub Check: tests-upgradeable
- GitHub Check: tests
- GitHub Check: tests-foundry
🔇 Additional comments (16)
contracts/utils/README.adoc (2)
41-41
: Docs entry for RLP — LGTM.
142-143
: Nav token for RLP — ensure referenced page/anchor exists
LGTM; nav token present at contracts/utils/README.adoc:142 ({{RLP}}), but no corresponding RLP page/heading was found in docs/ or contracts/ — add or update the RLP page/anchor to avoid a broken docs reference.contracts/mocks/Stateless.sol (1)
47-47
: Importing RLP to exercise build graph — LGTM.If any linter complains about unused imports in mocks, consider referencing one symbol in Dummy1234 to silence it; otherwise fine.
test/utils/Bytes.t.sol (1)
239-249
: Property tests look good.These fuzz the concat equivalence both directions (push/shift). Nice coverage.
test/utils/Memory.t.sol (3)
6-6
: Bytes import for helpers — LGTM.
10-10
: Scoped using directives — LGTM.
24-37
: Solid invariants for Slice/asSlice/toBytes.Covers clamping and equivalence with Bytes.slice; good edge handling.
contracts/utils/Bytes.sol (1)
266-269
: Sentinel semantics — LGTM.Using 0x00 as an empty sentinel for head/tail is fine given EVM memory layout; just keep it documented.
.changeset/lovely-cooks-add.md (1)
2-2
: Confirm changeset package nameRepo contains packages "@openzeppelin/contracts" and "openzeppelin-solidity". .changeset/lovely-cooks-add.md targets "openzeppelin-solidity" — confirm you intended to bump that package; if you meant "@openzeppelin/contracts", rename the changeset header to match so the changeset is applied correctly.
contracts/token/ERC20/extensions/ERC4626.sol (1)
4-4
: Pragma bump to ^0.8.24 — not safe without repo-wide alignmentScan shows mixed Solidity pragmas across the repo (many files at ^0.8.20, some at ^0.8.26 and ^0.8.27) while the reviewed file is ^0.8.24.
- Action: either normalize all contracts to a single ^0.8.24+ pragma, or configure the build/CI to compile with multiple solc versions and verify CI.
- Examples from the scan: contracts/access/AccessControl.sol → ^0.8.20; contracts/account/extensions/draft-AccountERC7579.sol → ^0.8.26; contracts/utils/RLP.sol → ^0.8.27; contracts/token/ERC20/extensions/ERC4626.sol → ^0.8.24.
contracts/utils/Memory.sol (2)
47-47
: LGTM! Well-designed memory slice abstraction.The
Slice
type provides an efficient 256-bit packed representation combining length (upper 128 bits) and pointer (lower 128 bits), enabling zero-copy slicing operations.
3-3
: Resolved — pragma bump to ^0.8.24 is safe for local importersAll local files that import contracts/utils/Memory.sol declare pragma ^0.8.24 or newer — no change required.
Importers found:
- contracts/utils/Bytes.sol — pragma ^0.8.24
- contracts/utils/RLP.sol — pragma ^0.8.27
- contracts/token/ERC20/extensions/ERC4626.sol — pragma ^0.8.24
- contracts/mocks/Stateless.sol — pragma ^0.8.26
Note: test/utils/Memory.t.sol and test/utils/RLP.t.sol import @openzeppelin/contracts/utils/Memory.sol and remain pragma ^0.8.20 (unrelated to the local file).
test/utils/RLP.test.js (1)
10-18
: LGTM! Clean resolution of function overloads.The approach to resolve overloaded
encode
functions follows the established pattern from Math.test.js, providing clear typed variants for testing.test/utils/RLP.t.sol (1)
39-41
: LGTM! Good defensive check for list length.The assumption correctly limits the fuzz input to prevent excessive memory allocation during testing.
contracts/utils/RLP.sol (2)
288-290
: LGTM! Efficient array resizing using assembly.The inline assembly efficiently resizes the array to the actual item count without reallocation, avoiding memory waste.
119-126
: LGTM! Clever and efficient boolean encoding.The assembly implementation efficiently encodes booleans using bitwise operations:
false
→ 0x80 (empty item),true
→ 0x01, which aligns with the RLP specification treating booleans as 0/1 integers.
contracts/utils/Bytes.sol
Outdated
assembly ("memory-safe") { | ||
result := mload(0x40) | ||
let ptr := add(result, 0x20) | ||
for { | ||
let it := mload(self) | ||
} iszero(iszero(it)) { | ||
it := mload(it) | ||
} { | ||
let buffer := mload(add(it, 0x20)) | ||
let length := mload(buffer) | ||
mcopy(ptr, add(buffer, 0x20), length) | ||
ptr := add(ptr, length) | ||
} | ||
mstore(result, sub(ptr, add(result, 0x20))) | ||
mstore(0x40, ptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: free memory pointer not 32‑byte aligned in flatten().
FMP must be rounded up to the next 32‑byte boundary. Writing mstore(0x40, ptr)
leaves it potentially unaligned and can corrupt subsequent allocations.
Apply this diff:
function flatten(Accumulator memory self) internal pure returns (bytes memory result) {
assembly ("memory-safe") {
result := mload(0x40)
- let ptr := add(result, 0x20)
+ let dataStart := add(result, 0x20)
+ let ptr := dataStart
for {
let it := mload(self)
} iszero(iszero(it)) {
it := mload(it)
} {
let buffer := mload(add(it, 0x20))
let length := mload(buffer)
mcopy(ptr, add(buffer, 0x20), length)
ptr := add(ptr, length)
}
- mstore(result, sub(ptr, add(result, 0x20)))
- mstore(0x40, ptr)
+ let total := sub(ptr, dataStart)
+ mstore(result, total)
+ // advance FMP to the next 32-byte boundary after the copied data
+ let newFmp := and(add(ptr, 31), not(31))
+ mstore(0x40, newFmp)
}
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
assembly ("memory-safe") { | |
result := mload(0x40) | |
let ptr := add(result, 0x20) | |
for { | |
let it := mload(self) | |
} iszero(iszero(it)) { | |
it := mload(it) | |
} { | |
let buffer := mload(add(it, 0x20)) | |
let length := mload(buffer) | |
mcopy(ptr, add(buffer, 0x20), length) | |
ptr := add(ptr, length) | |
} | |
mstore(result, sub(ptr, add(result, 0x20))) | |
mstore(0x40, ptr) | |
} | |
function flatten(Accumulator memory self) internal pure returns (bytes memory result) { | |
assembly ("memory-safe") { | |
result := mload(0x40) | |
let dataStart := add(result, 0x20) | |
let ptr := dataStart | |
for { let it := mload(self) } iszero(iszero(it)) { it := mload(it) } { | |
let buffer := mload(add(it, 0x20)) | |
let length := mload(buffer) | |
mcopy(ptr, add(buffer, 0x20), length) | |
ptr := add(ptr, length) | |
} | |
let total := sub(ptr, dataStart) | |
mstore(result, total) | |
// advance FMP to the next 32-byte boundary after the copied data | |
let newFmp := and(add(ptr, 31), not(31)) | |
mstore(0x40, newFmp) | |
} | |
} |
🤖 Prompt for AI Agents
In contracts/utils/Bytes.sol around lines 302 to 317 the free memory pointer
stored with mstore(0x40, ptr) may be left unaligned after writing flattened
bytes; update the final free-memory store to round ptr up to the next 32-byte
boundary before storing it (i.e., compute alignedPtr = (ptr + 31) & ~31 and
mstore(0x40, alignedPtr)) so the FMP remains 32-byte aligned and subsequent
allocations are safe.
contracts/utils/Memory.sol
Outdated
// TODO: Fail if len or ptr is larger than type(uint128).max ? | ||
assembly ("memory-safe") { | ||
result := or(shl(128, len), ptr) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add bounds validation for length and pointer values.
The TODO comment correctly identifies a potential issue. Values exceeding type(uint128).max
would cause silent truncation when packed into the 128-bit fields.
Apply this fix to add validation:
function _asSlice(uint256 len, Memory.Pointer ptr) private pure returns (Slice result) {
- // TODO: Fail if len or ptr is larger than type(uint128).max ?
+ require(len <= type(uint128).max, "Length exceeds uint128 max");
+ require(uint256(asBytes32(ptr)) <= type(uint128).max, "Pointer exceeds uint128 max");
assembly ("memory-safe") {
result := or(shl(128, len), ptr)
}
}
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In contracts/utils/Memory.sol around lines 58-62, add explicit bounds checks
before the assembly packing so neither len nor ptr exceed type(uint128).max; if
either value is > type(uint128).max revert with a clear error (or custom error)
to avoid silent truncation, then safely cast to uint128 (or mask) and perform
the existing assembly pack. Ensure both checks run on the original uint256
inputs and use descriptive revert messages or a custom error name.
// const invalidTests = [ | ||
// { name: 'short string with invalid length', input: '0x8100' }, | ||
// { name: 'long string with invalid length prefix', input: '0xb800' }, | ||
// { name: 'list with invalid length', input: '0xc100' }, | ||
// { name: 'truncated long string', input: '0xb838' }, | ||
// { name: 'invalid single byte encoding (non-minimal)', input: '0x8100' }, | ||
// ]; | ||
|
||
// invalidTests.forEach(({ name, input }) => { | ||
// it(`encodes ${name} into invalid RLP`, async function () { | ||
// const item = await this.mock.$toItem(input); | ||
// await expect(this.mock.$decodeBytes_bytes(item)).to.be.reverted; | ||
// }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider uncommenting and implementing the invalid RLP tests.
The commented test cases for invalid RLP encoding scenarios would provide valuable coverage for error handling paths. These edge cases are important for security.
Would you like me to help implement these invalid RLP test cases? They would strengthen the test suite by validating proper error handling for malformed inputs.
🤖 Prompt for AI Agents
In test/utils/RLP.test.js around lines 136 to 149, the invalid RLP test cases
are commented out; re-enable and implement them so malformed encodings assert
that decoding reverts. Restore the invalidTests array and its forEach block,
convert each hex string input into an item via this.mock.$toItem(input) inside
each it() and assert await
expect(this.mock.$decodeBytes_bytes(item)).to.be.reverted; ensure each it() has
a descriptive name using the test case name and run within the same async
function context as the existing tests.
No dependency changes detected. Learn more about Socket for GitHub. 👍 No dependency changes detected in pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments in Memory, I will continue reviewing the RLP and Accumulators library later
/// @dev Get a slice representation of a bytes object in memory | ||
function asSlice(bytes memory self) internal pure returns (Slice result) { | ||
assembly ("memory-safe") { | ||
result := or(shl(128, mload(self)), add(self, 0x20)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency
result := or(shl(128, mload(self)), add(self, 0x20)) | |
result := or(shl(0x80, mload(self)), add(self, 0x20)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curios if the 0x80 selection, is it just to fit the length and pointer together.
If so, I wonder if 128 bits is the right choice. It may be too large of a number regardless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want slices to be represented using a bytes32 an not a structure. That way we don't have to allocate memory for slice, they live on the (machine) stack like any other variable.
That means we have to fit 2 values in 32 bits. These values are "length" and "offset/ptr". Both will not realistically ever be bigger than 64 bits, but since I have 256 bits of space, I just slip that in 2 ans say "lets use 128 bits for each value".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the record, our today most real machines use 64bits memory space. That means all the memory (including virtual memory) of all the apps running on our machine (include AI training, video encoding, 3D rendering, ...) is addressed by the OS using only 64bits (ie 8 bytes).
That represents 16777216 Tib of memory space.
32bits machine could handle 4Gib of memory space, which is very small by today's standard (for a PC). I don't expect smart contracts to handle that amount of memory in a TX in the near future, but I guess I could be surprised by this happening sooner than I expect.
|
||
/// @dev Returns the memory location of a given slice (equiv to self.offset for calldata slices) | ||
function _pointer(Slice self) private pure returns (Memory.Pointer result) { | ||
assembly ("memory-safe") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried about the memory safe annotations since a slice may be arbitrary. In regular circumstances it should be fine, but we can't trust everyone will use it correctly. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I share this concern, however:
- Slices that are built using
asSlice(bytes memory)
as safe. - Slices that are build using
slice(Slice, uint256)
orslice(Slice, uint256, uint256)
are also safe.
So basically, the only way to build an "unsafe" slice is to do it manually by doing Slice.wrap(...)
, which we don't provide an helper for.
Note that is a slice is unsafe (pointer at unallocated location or length extends it beyond the allocated location), then _pointer
and length
will be safe. They don't access the memory (they just do arithmetics on the bytes32). The unsafe functions would be load(Slice, uint256)
that does an mload
and toBytes(slice)
that does an mcopy
// OpenZeppelin Contracts (last updated v5.4.0) (token/ERC20/extensions/ERC4626.sol) | ||
|
||
pragma solidity ^0.8.20; | ||
pragma solidity ^0.8.24; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we haven't release this file, I just noticed it may be clever to follow the same pattern as in ERC7579Utils where we define the type at the file level and then we create free functions to overload operators:
// Operators
using {eqPointer as ==} for Memory.Pointer global;
/// @dev Compares two `Pointer` values for equality.
function eqPointer(Memory.Pointer a, Memory.Pointer b) pure returns (bool) {
return Memory.asBytes32(a) == Memory.asBytes32(b);
}
Maybe overengineered but just mentioning before it becomes a breaking change. Note it would populate the global namespace for Pointer
uint256 lengthLength = prefix - SHORT_OFFSET - SHORT_THRESHOLD; | ||
|
||
require(itemLength > lengthLength, RLPInvalidDataRemainder(lengthLength, itemLength)); | ||
require(bytes1(item.load(0)) != 0x00); // TODO: custom error for sanity checks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe panic with RESOURCE_ERROR
code?
Requires:
Memory
utility library #5189reverseBits
operations to Bytes.sol #5724clz(bytes)
andclz(uint256)
functions #5725PR Checklist
npx changeset add
)